Smart Paper Notes

Object Detection and Tracking with Autonomous UAV

A. Huzeyfe Demir , Berke Yavas , Mehmet Yazici , Dogukan Aksu , M. Ali Aydin

Categories: Robotics | Computer Vision

2022-06-26

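In this paper, a combat unmanned aerial vehicle (UAV) is modeled in a simulation environment. The rotary-wing UAV successfully performs various tasks, such as locking onto targets, tracking them, and sharing the relevant data with surrounding vehicles. Different software technologies are employed, including API communication, ground control station configuration, autonomous movement algorithms, computer vision, and deep learning.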

Translated by Google Translate

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Sparse Bayesian Lasso via a Variable-Coefficient $\ell_1$ Penalty

Nathan Wycoff , Ali Arab , Katharine M. Donato , Lisa O. Singh

Categories: Machine Learning (Statistics)

2022-11-09

Modern statistical learning algorithms are capable of amazing flexibility, but struggle with interpretability. One possible solution is sparsity: making inference such that many of the parameters are estimated as being identically 0, which may be imposed through the use of nonsmooth penalties such as the $\ell_1$ penalty. However, the $\ell_1$ penalty introduces significant bias when high sparsity is desired. In this article, we retain the $\ell_1$ penalty, but define learnable penalty weights $\lambda_p$ endowed with hyperpriors. We start the article by investigating the optimization problem this poses, developing a proximal operator associated with the $\ell_1$ norm. We then study the theoretical properties of this variable-coefficient $\ell_1$ penalty in the context of penalized likelihood. Next, we investigate application of this penalty to Variational Bayes, developing a model we call the Sparse Bayesian Lasso which allows for behavior qualitatively like Lasso regression to be applied to arbitrary variational models. In simulation studies, this gives us the Uncertainty Quantification and low bias properties of simulation-based approaches with an order of magnitude less computation. Finally, we apply our methodology to a Bayesian lagged spatiotemporal regression model of internal displacement that occurred during the Iraqi Civil War of 2013-2017.
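
As background for the proximal operator mentioned above, here is a minimal sketch of the classic soft-thresholding operator, which is the proximal operator of a weighted $\ell_1$ penalty when the weights are held fixed. The function names `soft_threshold` and `prox_l1` are illustrative assumptions. The sketch does not implement the paper's operator, in which the weights $\lambda_p$ are themselves learnable and carry hyperpriors.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |x| for a scalar x and a fixed lam >= 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def prox_l1(beta, lambdas):
    """Soft-threshold each coefficient with its own fixed penalty weight."""
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    return [soft_threshold(b, lam) for b, lam in zip(beta, lambdas)]


# Small coefficients are set exactly to zero (sparsity), while large ones
# are shrunk toward zero by their weight (the bias the abstract refers to).
shrunk = prox_l1([3.0, -0.5, -2.0], [1.0, 1.0, 0.5])
```

In this example the coefficient 3.0 is pulled down to 2.0 even though it is clearly nonzero, which illustrates why a large fixed weight introduces bias when high sparsity is desired.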

Automated MRI Field of View Prescription from Region of Interest Prediction by Intra-stack Attention Neural Network

Ke Lei , Ali B. Syed , Xucheng Zhu , John M. Pauly , Shreyas S. Vasanawala

Categories: Computer Vision

2022-11-09

Manual prescription of the field of view (FOV) by MRI technologists is variable and prolongs the scanning process. Often, the FOV is too large or crops critical anatomy. We propose a deep-learning framework, trained by radiologists' supervision, for automating FOV prescription. An intra-stack shared feature extraction network and an attention network are used to process a stack of 2D image inputs to generate output scalars defining the location of a rectangular region of interest (ROI). The attention mechanism is used to make the model focus on the small number of informative slices in a stack. Then the smallest FOV that makes the neural network predicted ROI free of aliasing is calculated by an algebraic operation derived from MR sampling theory. We retrospectively collected 595 cases between February 2018 and February 2022. The framework's performance is examined quantitatively with intersection over union (IoU) and pixel error on position, and qualitatively with a reader study. We use the t-test for comparing quantitative results from all models and a radiologist. The proposed model achieves an average IoU of 0.867 and average ROI position error of 9.06 out of 512 pixels on 80 test cases, significantly better (P<0.05) than two baseline models and not significantly different from a radiologist (P>0.12). Finally, the FOV given by the proposed framework achieves an acceptance rate of 92% from an experienced radiologist.
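
The intersection over union (IoU) metric used in the evaluation above can be sketched for axis-aligned rectangles as follows. The `(x0, y0, x1, y1)` corner convention and the function name `rectangle_iou` are assumptions made for illustration; the abstract does not specify how the ROI is parameterized.

```python
def rectangle_iou(a, b):
    """IoU of two axis-aligned rectangles given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the overlap, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 squares offset by one pixel in each direction overlap in a 1x1 cell.
overlap_score = rectangle_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

An IoU of 1 means the predicted and reference rectangles coincide, and 0 means they do not overlap at all.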

MONAI: An open-source framework for deep learning in healthcare

M. Jorge Cardoso , Wenqi Li , Richard Brown , Nic Ma , Eric Kerfoot , Yiheng Wang , Benjamin Murrey , Andriy Myronenko , Can Zhao , Dong Yang

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-11-04

Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.

Generalisability of deep learning models in low-resource imaging settings: A fetal ultrasound study in 5 African countries

Carla Sendra-Balcells , Víctor M. Campello , Jordina Torrents-Barrena , Yahya Ali Ahmed , Mustafa Elattar , Benard Ohene Botwe , Pempho Nyangulu , William Stones , Mohammed Ammar , Lamya Nawal Benamer

Categories: Computer Vision

2022-09-20

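Most artificial intelligence (AI) research has concentrated on high-income countries, where imaging data, IT infrastructure, and clinical expertise are plentiful. Progress has been slower in limited-resource environments where medical imaging is needed. In Sub-Saharan Africa, for example, the rate of perinatal mortality is very high because access to antenatal screening is limited. In these countries, AI models could be deployed to help clinicians acquire fetal ultrasound planes for the diagnosis of fetal abnormalities. Deep learning models have so far been proposed to identify standard fetal planes, but there is no evidence of their ability to generalise to centres with limited access to high-end ultrasound equipment and data. This work investigates different strategies to reduce the domain-shift effect for a fetal plane classification model trained at a high-resource clinical centre and transferred to a new low-resource centre. To that end, the classifier is first evaluated at a new centre in Denmark with 1,008 patients, and is later optimised to reach the same performance in five African centres (Egypt, Algeria, Uganda, Ghana, and Malawi) with 25 patients each. The results show that a transfer learning approach can be a solution for integrating small African samples with existing large-scale databases in developed countries. In particular, the model can be improved by increasing the recall to $0.92 \pm 0.04$ while maintaining high precision. The framework shows promise for building new AI models that generalise across clinical centres with limited data acquired under challenging and heterogeneous conditions, and it calls for further research into new solutions for the usability of AI in countries with fewer resources.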

Towards Improving Calibration in Object Detection Under Domain Shift

Muhammad Akhtar Munir , Muhammad Haris Khan , M. Saquib Sarfraz , Mohsen Ali

Categories: Computer Vision

2022-09-15

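The increasing use of deep neural networks in safety-critical applications requires trained models to be well calibrated. Most current calibration techniques address classification problems and focus on improving the calibration of in-domain predictions. Little attention has been paid to the calibration of visual object detectors, which occupy a similar space and importance in many decision-making systems. In this paper, we study the calibration of current object detection models, particularly under domain shift. To this end, we first introduce a plug-and-play train-time calibration loss for object detection. It can be used as an auxiliary loss function to improve a detector's calibration. Second, we devise a new uncertainty quantification mechanism for object detection that can implicitly calibrate the commonly used self-training-based domain-adaptive detectors. Our study includes both single-stage and two-stage object detectors. We demonstrate that our loss improves calibration for both in-domain and out-of-domain detections by notable margins. Finally, we show the utility of our techniques in calibrating domain-adaptive object detectors in diverse domain-shift scenarios.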
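
The abstract does not say which calibration metric is used. As a generic illustration of what "well calibrated" means, the sketch below computes the expected calibration error (ECE), a common measure that bins predictions by confidence and compares the mean confidence with the observed accuracy in each bin. The function name, the equal-width binning, and the input format (one confidence and one correctness flag per prediction) are assumptions, not details taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy over equal-width bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Four predictions made with 90% confidence, of which only three are correct:
# the model is overconfident by roughly 0.15.
gap = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A perfectly calibrated model has an ECE of zero, because in every bin the stated confidence matches the fraction of correct predictions.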

Revisiting Outer Optimization in Adversarial Training

Ali Dabouei , Fariborz Taherkhani , Sobhan Soleymani , Nasser M. Nasrabadi

Categories: Machine Learning

2022-09-02

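Despite the fundamental distinction between adversarial and natural training (AT and NT), AT methods generally adopt momentum SGD (MSGD) for the outer optimization. This paper analyzes this choice by investigating the overlooked role of outer optimization in AT. Our exploratory evaluations reveal that AT induces a higher gradient norm and variance than NT. Because the convergence rate of MSGD depends heavily on the variance of the gradients, this phenomenon hinders the outer optimization in AT. To address this, we propose an optimization method called ENGM, which regularizes the contribution of each input example to the average mini-batch gradient. We prove that the convergence rate of ENGM is independent of the variance of the gradients, which makes it suitable for AT. We introduce a trick to reduce the computational cost of ENGM using empirical observations on the correlation between the norm of the gradients w.r.t. the network parameters and the norm of the gradients w.r.t. the input examples. Our extensive evaluations and ablation studies on CIFAR-10, CIFAR-100, and TinyImageNet demonstrate that ENGM and its variants consistently improve the performance of a wide range of AT methods. Furthermore, ENGM alleviates major shortcomings of AT, including robust overfitting and high sensitivity to hyperparameter settings.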
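
The abstract does not give ENGM's update rule. To illustrate the general idea of bounding each example's contribution to the mini-batch gradient, the sketch below uses a swapped-in technique, plain per-example gradient-norm clipping before averaging. The function name `clipped_mean_gradient` and the threshold `max_norm` are hypothetical, and this should not be read as the authors' algorithm.

```python
import math


def clipped_mean_gradient(per_example_grads, max_norm):
    """Average per-example gradients after rescaling any whose L2 norm exceeds max_norm."""
    if not per_example_grads:
        raise ValueError("need at least one per-example gradient")
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(v * v for v in grad))
        # Examples with a large gradient norm are scaled down to max_norm;
        # the others are left unchanged.
        scale = 1.0 if norm <= max_norm else max_norm / norm
        for i, v in enumerate(grad):
            total[i] += scale * v
    count = len(per_example_grads)
    return [t / count for t in total]


# The first example has norm 5 and is rescaled to norm 1; the second
# (norm 0.5) passes through untouched.
mean_grad = clipped_mean_gradient([[3.0, 4.0], [0.3, 0.4]], max_norm=1.0)
```

Bounding every per-example norm limits how much a few high-gradient examples can dominate the averaged update, which is the kind of variance effect the abstract attributes to AT.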

Monkeypox Skin Lesion Detection Using Deep Learning Models: A Feasibility Study

Shams Nafisa Ali , Md. Tazuddin Ahmed , Joydip Paul , Tasnim Jahan , S. M. Sakeef Sani , Nawsabah Noor , Taufiq Hasan

Categories: Computer Vision | Artificial Intelligence

2022-07-06

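The recent monkeypox outbreak has become a public health concern because of its rapid spread to more than 40 countries outside Africa. Clinical diagnosis of monkeypox at an early stage is challenging because of its similarity to chickenpox and measles. Where confirmatory polymerase chain reaction (PCR) tests are not readily available, computer-assisted detection of monkeypox lesions could be beneficial for surveillance and rapid identification of suspected cases. Deep learning methods have been found effective in the automated detection of skin lesions, provided that sufficient training examples are available. However, no such dataset is currently available for monkeypox. In the current study, we first develop the "Monkeypox Skin Lesion Dataset (MSLD)". Data augmentation is used to increase the sample size, and a 3-fold cross-validation experiment is set up. In the next step, several pre-trained deep learning models, namely VGG-16, ResNet50, and InceptionV3, are employed to classify monkeypox and other diseases. An ensemble of the three models is also developed. ResNet50 achieves the best overall accuracy of $82.96(\pm4.57\%)$, while VGG16 and the ensemble system achieve accuracies of $81.48(\pm6.87\%)$ and $79.26(\pm1.05\%)$, respectively. A prototype web application is also developed as an online monkeypox screening tool. While the initial results on this limited dataset are promising, a larger, demographically diverse dataset is required to further enhance the generalizability of these models.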

Scalable Polar Code Construction for Successive Cancellation List Decoding: A Graph Neural Network-Based Approach

Yun Liao , Seyyed Ali Hashemi , Hengjie Yang , John M. Cioffi

Categories: Artificial Intelligence

2022-07-03

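While constructing polar codes for successive-cancellation decoding can be implemented efficiently by sorting the bit-channels, finding optimal polar-code constructions for successive-cancellation list (SCL) decoding in an efficient and scalable manner still awaits investigation. This paper proposes a graph neural network (GNN)-based reinforcement algorithm, named the iterative message-passing (IMP) algorithm, to solve the polar-code construction problem for SCL decoding. The algorithm operates only on the local structure of the graph induced by the polar code's generator matrix. The size of the IMP model is independent of the blocklength and the code rate, making it scalable to polar codes with long blocklengths. Moreover, a single trained IMP model can be applied directly to a wide range of target blocklengths, code rates, and channel conditions, and the corresponding polar codes can be generated without separate training. Numerical experiments show that the IMP algorithm finds polar-code constructions that significantly outperform classical constructions under cyclic-redundancy-check-aided SCL (CA-SCL) decoding. Compared with other learning-based construction methods tailored to SCL/CA-SCL decoding, the IMP algorithm constructs polar codes with comparable or lower frame error rates, while significantly reducing the training complexity by eliminating the need for separate training at each target blocklength, code rate, and channel condition.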
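
The classical baseline mentioned above, constructing a polar code for successive-cancellation decoding by sorting the bit-channels, can be sketched for the binary erasure channel (BEC), where the Bhattacharyya parameters follow an exact recursion. The BEC assumption, the default erasure probability, and the function names are illustrative choices. This is not the IMP algorithm, and it does not target SCL decoding.

```python
def bec_bhattacharyya(n_levels, erasure_prob):
    """Bhattacharyya parameters of the 2**n_levels bit-channels of a BEC."""
    z = [erasure_prob]
    for _ in range(n_levels):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # degraded ("minus") bit-channel
            nxt.append(v * v)          # upgraded ("plus") bit-channel
        z = nxt
    return z


def construct_polar_code(n_levels, k, erasure_prob=0.5):
    """Return the sorted indices of the k most reliable bit-channels."""
    z = bec_bhattacharyya(n_levels, erasure_prob)
    # A smaller Bhattacharyya parameter means a more reliable bit-channel.
    order = sorted(range(len(z)), key=lambda i: z[i])
    return sorted(order[:k])


# Blocklength 8, rate 1/2, BEC with erasure probability 0.5.
info_set = construct_polar_code(n_levels=3, k=4)
```

The selected indices carry information bits and the remaining positions are frozen. This ranking is tied to successive-cancellation decoding, which is why the abstract treats construction for SCL decoding as a separate, open problem.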
